Abstract

An analytical formula for the contributions of the trend leftovers in the DFA method is presented,
based upon which the crossovers in DFA are investigated in detail. This general formula explains
the results calculated with the DFA method for some examples in the literature very well.

I. Introduction

Measurement records of a complex dynamical process form a long time series. Analysis of the
time series can reveal the essential dynamical mechanisms of the process, which in turn can
provide physical pictures and criteria for constructing dynamical models. Many natural
sequences have attracted special attention recently. Typical examples include DNA sequences,
weather and climate records, heartbeat and gait series, price evolutions and information
streams on the Internet, etc. [1-11]. A common feature of all these sequences is that there
exists a long-range correlation among the elements; that is to say, the correlation function
obeys a power law and there is no finite correlation length. When we begin to study these
admirably accurate data, almost immediately we encounter three roadblocks [5]. The first is that
we must find the non-trivial self-similar properties instead of the trivial ones. Rescaling only
the t (time) dimension and keeping the y (time interval) dimension invariant, we can obtain
segments at different time scales. The segments have a self-similar property, but it is a trivial
one that we are not interested in. The second is that environments always perturb measurements,
and the perturbations can be described as white noise. What is more, statistical noise appears
due to the finite number of measurements. The third is that these data are non-stationary. This
means that the statistical properties of a time series are not constant in time; they are not
served up neatly as independent, identically distributed random variables. Traditional
approaches such as power-spectrum and correlation analysis are not suited to accurately
quantifying long-range correlations in this kind of non-stationary signal. Hence,
non-stationarity is the essential problem to be resolved [12-15].

In the literature [16], the DFA (detrended fluctuation analysis) method was designed to solve
these problems. DFA is a scaling analysis method that provides a simple quantitative parameter,
the scaling exponent $\alpha$, to represent the correlation properties of a time series. The
advantages of DFA over many methods are that it permits the detection of long-range
correlations embedded in seemingly non-stationary time series, and also avoids the spurious
detection of apparent long-range correlations that are artifacts of non-stationarity. In the
past few years, more than 100 publications have utilized DFA as the method of correlation
analysis, and fruitful results have been achieved in many research fields such as cardiac
dynamics, bioinformatics, economics, meteorology, geology, etc. [17-19].

The correct interpretation of the scaling results obtained by the DFA method is crucial for
understanding the intrinsic dynamics of the systems under study. In fact, as pointed out in
references [17-19], for all systems where the DFA method has been applied, many issues remain
unexplained. One of the common challenges is that the correlation exponent is not always a
constant (independent of scale) and crossovers often exist. A crossover can arise from a change
in the correlation properties of the signal at different time/space scales, or from trends in
the data. Though the DFA-n method can eliminate trends up to order $n$ in the profile, the
leftovers may generate artificial long-range correlations, which in turn can induce crossovers.
In this paper we present a general formula for the leftovers' contribution in the DFA method.
As an example, a leftover with sinusoidal form is investigated in detail. The results can
explain the calculations presented in references [18,19] closely.

II. DFA method and the contribution of leftovers 

1. DFA method 

Normally, to find the non-trivial self-similar properties and reduce the perturbations due to
noise, an integrated time series is introduced instead of investigating the initial time series
directly. The integrated time series can be constructed as follows,

$$Y_m = \sum_{i=1}^{m} x_i ,$$

where $\{Y_m \mid m = 1,2,3,\dots\}$ forms the integrated time series, called the "profile", and
$\{x_i \mid i = 1,2,3,\dots\}$ is the initial time series.

To avoid spurious detection of correlations that are artifacts of non-stationarity, the DFA
method was proposed to calculate the time-dependent fluctuation function. The DFA procedure
consists of four steps, as illustrated below [19].

Step 1. Construct the profile as described above.

Step 2. Cut the profile $Y(i)$ into $N_s = [N/s]$ non-overlapping segments of equal length $s$.
Since the record length $N$ need not be a multiple of the considered time scale $s$, a short part
at the end of the profile will remain in most cases. In order not to disregard this part of the
record, the same procedure is repeated starting from the other end of the record. Thus, $2N_s$
segments are obtained altogether.

Step 3. Calculate the local trend for each segment $\nu$ by a least-squares fit of the data. Then
define the detrended time series for segment duration $s$, denoted by $Y_s(i)$, as the difference
between the original time series and the fits: $Y_s(i) = Y(i) - p_\nu(i)$, where $p_\nu(i)$ is the
fitting polynomial in the $\nu$-th segment. Linear, quadratic, cubic or higher-order polynomials
can be used in the fitting procedure (called DFA-1, DFA-2, DFA-3 and, in general, DFA-n). Since
the detrending of the time series is done by subtraction of the fits from the profile, these
methods differ in their capability of eliminating trends in the data. In DFA-n, trends of order
$n$ in the profile and of order $n-1$ in the original record are eliminated. Thus, a comparison of
the results for different orders of DFA allows estimating the strength of the trends in the time
series.

Step 4. Calculate the variance for each of the $2N_s$ segments:

$$F_s^2(\nu) = \langle Y_s^2(i)\rangle = \frac{1}{s}\sum_{i=1}^{s} Y_s^2[(\nu-1)s+i]$$

of the detrended time series $Y_s(i)$ by averaging over all data points $i$ in the $\nu$-th segment.

Finally, average over all segments and take the square root to obtain the DFA fluctuation
function:

$$F(s) = \left[\frac{1}{2N_s}\sum_{\nu=1}^{2N_s} F_s^2(\nu)\right]^{1/2}.$$

For different detrending orders $n$ we can obtain different fluctuation functions, denoted by
$F^{(n)}(s)$.

It can be proved that if the initial data $\{x_i \mid i = 1,2,\dots,N\}$ are long-range power-law
correlated, i.e., $C(s) = \langle x_i x_{i+s}\rangle \propto s^{-\gamma}$, the fluctuation function
$F^{(n)}(s)$ increases as a power law,

$$F^{(n)}(s) \propto s^{1-\gamma/2} = s^{\alpha}, \quad \text{for large } s \text{ values.}$$
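The four steps above can be sketched in pure Python as follows. This is only a minimal illustration under stated assumptions: linear detrending (DFA-1) is used instead of a general order $n$, the test signal is uncorrelated Gaussian noise, and the function names (`profile`, `dfa1`, `scaling_exponent`) are hypothetical and not taken from the references.

```python
import math
import random


def profile(x):
    """Step 1: integrated series Y_m = sum_{i<=m} x_i."""
    total, y = 0.0, []
    for value in x:
        total += value
        y.append(total)
    return y


def linear_fit_residual_variance(seg):
    """Step 3 and 4 for one segment: mean squared residual of a least-squares line."""
    s = len(seg)
    t_mean = (s - 1) / 2.0
    y_mean = sum(seg) / s
    stt = sum((t - t_mean) ** 2 for t in range(s))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
    slope = sty / stt
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(seg)) / s


def dfa1(x, scales):
    """Return F(s) for each scale s, using 2*N_s segments (from both ends)."""
    y = profile(x)
    n = len(y)
    result = []
    for s in scales:
        n_s = n // s  # Step 2: number of full segments of length s
        variances = []
        for v in range(n_s):
            variances.append(linear_fit_residual_variance(y[v * s:(v + 1) * s]))
            variances.append(linear_fit_residual_variance(y[n - (v + 1) * s:n - v * s]))
        result.append(math.sqrt(sum(variances) / len(variances)))
    return result


def scaling_exponent(scales, fluct):
    """Least-squares slope of log F(s) versus log s."""
    lx = [math.log(s) for s in scales]
    ly = [math.log(f) for f in fluct]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    return (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
            / sum((a - mx) ** 2 for a in lx))


random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(8192)]  # assumed test signal
scales = [16, 32, 64, 128, 256]
alpha = scaling_exponent(scales, dfa1(white, scales))
print(alpha)  # for uncorrelated noise the exponent should lie near 0.5
```

For a long-range correlated input the same routine would be expected to return an exponent above 0.5, in line with the power-law relation stated above.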

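As a brief argument we consider a time series without trends and with zero offset. Then the
mean-square displacement in each segment $\nu$ can be calculated:

$$\langle Y^2(i)\rangle = \left\langle\sum_{k=1}^{i}x_k^2\right\rangle + \left\langle\sum_{k\neq j}x_jx_k\right\rangle = i\langle x^2\rangle + \sum_{k\neq j}C(|k-j|) = i\langle x^2\rangle + 2\sum_{k=1}^{i-1}(i-k)C(k),$$

where the sums over $k \neq j$ run over $j,k \le i$ and $C(k)$ is the autocorrelation function,
$C(s) = \langle x_i x_{i+s}\rangle \propto s^{-\gamma}$.
For large $i$, the second term can be approximated:

$$\sum_{k=1}^{i-1}(i-k)C(k) \sim i\sum_{k=1}^{i-1}k^{-\gamma} - \sum_{k=1}^{i-1}k^{1-\gamma},$$

and

$$i\sum_{k=1}^{i-1}k^{-\gamma} - \sum_{k=1}^{i-1}k^{1-\gamma} \sim i\int_{1}^{i}k^{-\gamma}\,dk - \int_{1}^{i}k^{1-\gamma}\,dk \sim i^{2-\gamma}.$$

If the data are long-range power-law correlated with $0 < \gamma < 1$, this term will dominate for
large $i$, giving:

$$\langle Y^2(i)\rangle \sim i^{2-\gamma}.$$

A similar approximation for $F(s)$ leads to:

$$F(s) \sim s^{1-\gamma/2}, \qquad \alpha = 1-\gamma/2.$$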
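The scaling of the second term can be checked numerically. The sketch below assumes the pure power-law model $C(k) = k^{-\gamma}$ with $\gamma = 0.5$ (an illustrative choice) and estimates the local log-log slope of $\sum_{k=1}^{i-1}(i-k)C(k)$, which should approach $2-\gamma$ for large $i$. The function names are hypothetical.

```python
import math


def second_term(i, gamma):
    """sum_{k=1}^{i-1} (i - k) * C(k) with the assumed model C(k) = k**(-gamma)."""
    return sum((i - k) * k ** (-gamma) for k in range(1, i))


def local_exponent(i, gamma):
    """Log-log slope of the sum between i and 2*i."""
    return math.log(second_term(2 * i, gamma) / second_term(i, gamma)) / math.log(2.0)


gamma = 0.5
slope = local_exponent(10000, gamma)
print(slope, 2.0 - gamma)  # the two numbers should be close for large i
```

For $\gamma$ closer to 1 the approach to the asymptotic exponent is slower, because the linear correction term decays only as $i^{\gamma-1}$ relative to the leading term.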

2. The contribution of the leftovers

The detrending procedure in the DFA method can eliminate the trends effectively, but it cannot
separate the dynamical fluctuations from the trends exactly. Actually, the detrended profile
contains two parts. One part is the dynamical fluctuations and the other part is the leftovers
of the trends, $T(i),\ i = 1,2,\dots,N$.

A useful characteristic of the leftovers is that they obey a deterministic function, i.e., they
are completely correlated. What is more, because the two parts are independent of each other,
the coupling between them should be zero. Denoting the integrated true trend, the trend
eliminated in the DFA method, the integrated leftovers and the integrated dynamical
fluctuations by $T_I(i)$, $T_{II}(i)$, $\Delta T_I(i) = T_I(i) - T_{II}(i) = \sum_{n=1}^{i} T(n)$ and
$F(i)$ respectively, the result calculated with DFA can be obtained as,

$$\begin{aligned}
\sqrt{\langle(\Delta T_I(i) + F(i))^2\rangle} &= \sqrt{\langle\Delta T_I(i)^2\rangle + \langle F(i)^2\rangle + 2\langle\Delta T_I(i)\cdot F(i)\rangle} \\
&= \sqrt{\langle\Delta T_I(i)^2\rangle + \langle F(i)^2\rangle + 2\langle\Delta T_I(i)\rangle\cdot\langle F(i)\rangle} \\
&= \sqrt{\langle\Delta T_I(i)^2\rangle + \langle F(i)^2\rangle}.
\end{aligned}$$
That is to say, the calculated result with DFA contains two separate contributions. One part is 
from the dynamical fluctuations and the other part is from the leftovers. 
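The separation into two contributions can be illustrated numerically. The sketch below uses hypothetical stand-ins: a sinusoid plays the role of the deterministic leftover term and independent zero-mean Gaussian noise plays the role of the fluctuation term. These choices are assumptions made for illustration, not the signals used in the references. The cross term should then be small compared with the two squared contributions.

```python
import math
import random

random.seed(1)
n = 20000
# Hypothetical stand-ins for the deterministic leftover and the fluctuation.
delta_t = [5.0 * math.sin(2.0 * math.pi * i / 100.0) for i in range(n)]
fluct = [random.gauss(0.0, 1.0) for _ in range(n)]


def mean(values):
    """Arithmetic mean of a list."""
    return sum(values) / len(values)


total = mean([(t + f) ** 2 for t, f in zip(delta_t, fluct)])
trend_part = mean([t * t for t in delta_t])
fluct_part = mean([f * f for f in fluct])
cross = 2.0 * mean([t * f for t, f in zip(delta_t, fluct)])

# total equals trend_part + fluct_part + cross; cross should be comparatively small
print(total, trend_part + fluct_part, cross)
```

The residual cross term shrinks as the averaging length grows, which is the numerical counterpart of setting the coupling to zero.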

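The deterministic trend with length $N$ (e.g., the leftovers), denoted by $T(x),\ x = 1,2,\dots,N$
here, can be extended periodically to the whole space as,

$$G(x) = G(x + mN) = T(x), \qquad m = 0,\pm1,\pm2,\dots;\ 1 \le x \le N.$$

Accordingly, $G(x)$ can be expanded in the set
$\left\{\exp\left(i\frac{2n\pi}{N}x\right) \,\middle|\, n = 1,2,3,\dots,N\right\}$, which reads,

$$G(x) = \sum_{n=1}^{N} a_n\exp\left(i\frac{2n\pi}{N}x\right).$$

The autocorrelation function for the trend $G(x)$ can be defined as,

$$\begin{aligned}
C(\tau) &= c\cdot\int G(x)^{*}\cdot G(x+\tau)\,dx \propto \sum_{n=1}^{N}\sum_{m=1}^{N} a_n^{*}a_m\int\exp\left(-i\frac{2n\pi}{N}x\right)\exp\left(i\frac{2m\pi}{N}x + i\frac{2m\pi}{N}\tau\right)dx \\
&\propto \sum_{n=1}^{N}\sum_{m=1}^{N} a_n^{*}a_m\,\delta(n-m)\exp\left(i\frac{2m\pi}{N}\tau\right) = \sum_{n=1}^{N}|a_n|^2\exp\left(i\frac{2n\pi}{N}\tau\right).
\end{aligned}$$

Therefore,

$$\langle(\Delta T(\tau))^2\rangle = \tau\langle G(x)^2\rangle + 2\sum_{k=1}^{\tau-1}(\tau-k)C(k).$$

The first term is,

$$\tau\langle G(x)^2\rangle = \tau\cdot\sum_{n=1}^{N}|a_n|^2 = c_0\cdot\tau.$$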
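The second term is,

$$\begin{aligned}
2\sum_{k=1}^{\tau-1}(\tau-k)C(k) &= 2\sum_{k=1}^{\tau-1}(\tau-k)\sum_{n=1}^{N}|a_n|^2\exp\left(i\frac{2n\pi}{N}k\right) \\
&= 2\sum_{n=1}^{N}|a_n|^2\sum_{k=1}^{\tau-1}(\tau-k)\exp\left(i\frac{2n\pi}{N}k\right) \\
&= 2\sum_{n=1}^{N}|a_n|^2\left[\tau\sum_{k=1}^{\tau-1}\exp\left(i\frac{2n\pi}{N}k\right) - \sum_{k=1}^{\tau-1}k\exp\left(i\frac{2n\pi}{N}k\right)\right].
\end{aligned}$$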

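For large $\tau$ we have,

$$\sum_{k=1}^{\tau-1}\tau\exp\left(i\frac{2n\pi}{N}k\right) \approx \tau\int_{1}^{\tau-1}\exp\left(i\frac{2n\pi}{N}k\right)dk = \tau\,\frac{N}{i\,2n\pi}\left[\exp\left(i\frac{2n\pi}{N}(\tau-1)\right) - \exp\left(i\frac{2n\pi}{N}\right)\right],$$

$$\begin{aligned}
\sum_{k=1}^{\tau-1}k\exp\left(i\frac{2n\pi}{N}k\right) &\approx \int_{1}^{\tau-1}k\exp\left(i\frac{2n\pi}{N}k\right)dk \\
&= \frac{N}{i\,2n\pi}\left[(\tau-1)\exp\left(i\frac{2n\pi}{N}(\tau-1)\right) - \exp\left(i\frac{2n\pi}{N}\right)\right] + \left(\frac{N}{2n\pi}\right)^2\left[\exp\left(i\frac{2n\pi}{N}(\tau-1)\right) - \exp\left(i\frac{2n\pi}{N}\right)\right].
\end{aligned}$$

Hence, the second term can also be expressed as,

$$2\sum_{k=1}^{\tau-1}(\tau-k)C(k) \propto \sum_{n=1}^{N}|a_n|^2\left\{\frac{N}{i\,2n\pi}\left[\exp\left(i\frac{2n\pi}{N}(\tau-1)\right) - (\tau-1)\exp\left(i\frac{2n\pi}{N}\right)\right] - \left(\frac{N}{2n\pi}\right)^2\left[\exp\left(i\frac{2n\pi}{N}(\tau-1)\right) - \exp\left(i\frac{2n\pi}{N}\right)\right]\right\}.$$

The autocorrelation function should be a real number. Therefore, taking the real part, we have
the following relation,

$$\langle(\Delta T(\tau))^2\rangle = \tau\langle G(x)^2\rangle + 2\sum_{k=1}^{\tau-1}(\tau-k)C(k),$$

$$\langle(\Delta T(\tau))^2\rangle = c_0\tau + c_1\sum_{n=1}^{N}|a_n|^2\left\{\frac{N}{2n\pi}\sin\left[\frac{2n\pi}{N}(\tau-1)\right] - \frac{N(\tau-1)}{2n\pi}\sin\left(\frac{2n\pi}{N}\right) - \left(\frac{N}{2n\pi}\right)^2\cos\left[\frac{2n\pi}{N}(\tau-1)\right] + \left(\frac{N}{2n\pi}\right)^2\cos\left(\frac{2n\pi}{N}\right)\right\}.$$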
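Two steps of the derivation above lend themselves to a quick numerical check. The sketch below first compares the directly computed autocorrelation of a periodic trend with the series $\sum_n |a_n|^2\exp(i 2n\pi\tau/N)$, and then compares the exact sum $2\sum_{k=1}^{\tau-1}(\tau-k)\cos(\omega k)$ for a single harmonic with its integral approximation. The coefficients, the integration limits $1$ and $\tau-1$, and all function names are assumptions of this sketch.

```python
import cmath
import math

N_TOTAL = 64
COEFFS = {1: 0.5, 3: 1.0 + 2.0j, 7: -0.3j}  # hypothetical a_n values


def trend(x, coeffs, n_total):
    """G(x) = sum_n a_n exp(i 2 pi n x / N)."""
    return sum(a * cmath.exp(2j * math.pi * n * x / n_total) for n, a in coeffs.items())


def autocorr_direct(tau, coeffs, n_total):
    """(1/N) sum_x conj(G(x)) G(x + tau) over one full period."""
    return sum(trend(x, coeffs, n_total).conjugate() * trend(x + tau, coeffs, n_total)
               for x in range(n_total)) / n_total


def autocorr_series(tau, coeffs, n_total):
    """sum_n |a_n|^2 exp(i 2 pi n tau / N)."""
    return sum(abs(a) ** 2 * cmath.exp(2j * math.pi * n * tau / n_total)
               for n, a in coeffs.items())


def second_term_exact(tau, omega):
    """2 * sum_{k=1}^{tau-1} (tau - k) cos(omega k), single harmonic with |a_n| = 1."""
    return 2.0 * sum((tau - k) * math.cos(omega * k) for k in range(1, tau))


def second_term_closed(tau, omega):
    """Real part of the integral approximation with limits 1 and tau - 1."""
    u = tau - 1
    return 2.0 * ((math.sin(omega * u) - u * math.sin(omega)) / omega
                  - (math.cos(omega * u) - math.cos(omega)) / omega ** 2)


omega = 2.0 * math.pi * 0.001  # frequency q/N = 0.001, as in the figure captions
exact = second_term_exact(200, omega)
closed = second_term_closed(200, omega)
print(autocorr_direct(5, COEFFS, N_TOTAL), autocorr_series(5, COEFFS, N_TOTAL))
print(exact, closed)
```

The difference between the exact sum and the closed form is of order $\tau$, so the integral approximation becomes relatively more accurate as the window grows, provided the sum itself is not close to zero.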



The final relation between the statistical quantity in the DFA method and the size of the
window $\tau$ can be written as,

$$\langle(\Delta T(\tau))^2\rangle + \langle F(\tau)^2\rangle = c_0\tau + c_f\tau^{2\alpha} + c_1\sum_{n=1}^{N}|a_n|^2\left\{\frac{N}{2n\pi}\sin\left[\frac{2n\pi}{N}(\tau-1)\right] - \frac{N(\tau-1)}{2n\pi}\sin\left(\frac{2n\pi}{N}\right) - \left(\frac{N}{2n\pi}\right)^2\cos\left[\frac{2n\pi}{N}(\tau-1)\right] + \left(\frac{N}{2n\pi}\right)^2\cos\left(\frac{2n\pi}{N}\right)\right\},$$

where $c_f\tau^{2\alpha}$ is the contribution of the dynamical fluctuations. We can find that the
contributions of the leftovers have a complex form.



III. A Typical Example 

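As an example, we consider a time series of correlated noise with a sinusoidal trend. The
leftovers can be expressed in a sinusoidal form. The frequency and the amplitude of the trend
can be written as $q/N$ and $A_q$, i.e., $a_n = \delta(n-q)A_q$. The statistical quantity in the DFA
method can be reduced to a simple form as,

$$\langle(\Delta T(\tau))^2\rangle + \langle F(\tau)^2\rangle = c_0\tau + c_f\tau^{2\alpha} + c_1A_q^2\left\{\frac{N}{2q\pi}\sin\left[\frac{2q\pi}{N}(\tau-1)\right] - \frac{N(\tau-1)}{2q\pi}\sin\left(\frac{2q\pi}{N}\right) - \left(\frac{N}{2q\pi}\right)^2\cos\left[\frac{2q\pi}{N}(\tau-1)\right] + \left(\frac{N}{2q\pi}\right)^2\cos\left(\frac{2q\pi}{N}\right)\right\}.$$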



The competition between the effects of the sinusoidal signal and of the correlated noise on
this statistical quantity determines its scaling behavior.

When the window size $\tau$ is large enough, the effects of the trigonometric terms can be
neglected because they oscillate within a limited range. The effect of the linear term can
also be neglected in comparison with the term $c_f\tau^{2\alpha}$. Therefore, we can expect that the
scaling behavior of the statistical quantity versus $\tau$ should be the same as that of pure
dynamical fluctuations versus $\tau$, i.e., it scales as $c_f\tau^{2\alpha}$ in the scale range with
large $\tau$.

On the contrary, in the scale range with small $\tau$ (here we consider the condition
$\frac{2q\pi}{N}\tau \ll 1$), we can expand the trigonometric terms into Taylor series and neglect
the high-order terms. We find that the statistical quantity is approximately
$c_0\tau + c_f\tau^{2\alpha} + A_q^2(c_1\tau^2 + \dots)$. The power-law term $c_f\tau^{2\alpha}$ may also
be dominant in this condition.

In the intermediate range of the window size $\tau$, all the terms in the original relation can
be comparable, and none of them can be neglected.
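The three regimes can be explored by evaluating the single-harmonic relation directly. The sketch below uses the parameter values quoted in the figure captions ($\alpha = 0.75$, amplitude 2, frequency 0.01, all constants equal to 1). The exact form and signs of the trigonometric bracket follow the real part of the integral approximation and are an assumption of this sketch, as are the function names.

```python
import math


def leftover_plus_noise(tau, alpha, amplitude, freq, c0=1.0, cf=1.0, c1=1.0):
    """c0*tau + cf*tau**(2*alpha) + c1*A**2*{trigonometric terms}, omega = 2*pi*freq."""
    omega = 2.0 * math.pi * freq  # freq plays the role of q/N
    u = tau - 1.0
    trig = ((math.sin(omega * u) - u * math.sin(omega)) / omega
            - (math.cos(omega * u) - math.cos(omega)) / omega ** 2)
    return c0 * tau + cf * tau ** (2.0 * alpha) + c1 * amplitude ** 2 * trig


def local_slope(tau, **params):
    """Log-log slope of the quantity between tau and 2*tau."""
    return math.log(leftover_plus_noise(2.0 * tau, **params)
                    / leftover_plus_noise(tau, **params)) / math.log(2.0)


params = dict(alpha=0.75, amplitude=2.0, freq=0.01)
for tau in (4, 50, 1e3, 1e6):
    # small, intermediate and large windows
    print(tau, leftover_plus_noise(tau, **params), local_slope(tau, **params))
```

At the largest window the local slope should approach $2\alpha$, while at small and intermediate windows the leftover terms shift it away from that value.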
Varying the frequency, i.e., the value of $q$, and the power-law exponent $\alpha$, we present the
results for the statistical quantity versus $\tau$ in Fig.(1a), Fig.(1b), Fig.(2a) and Fig.(2b).
We can find that there are three crossovers in the curves versus $\tau$, denoted by
$n_{1\times}$, $n_{2\times}$ and $n_{3\times}$ here, respectively. These crossovers separate the whole
range of the window size into four parts, denoted by I, II, III and IV in the figures. The
contributions of the trigonometric terms oscillate with a period equal to that of the initial
sinusoidal trend. From the figures we can find that the crossover $n_{2\times}$ corresponds to
half of the first period. Clearly, $n_{2\times}$ should be $\frac{N}{2q}$, half of the period
$N/q$, almost exactly.

As for $n_{1\times}$, because it appears in the scale range with small $\tau$, its position should be
the intercept of the two terms $c_0\tau + c_f\tau^{2\alpha}$ and
$c_1A_q^2\tau^2 + c_1A_q^2\left(\frac{2q\pi}{N}\right)^2\tau^4 + \dots$ in the expansion
$c_0\tau + c_f\tau^{2\alpha} + c_1A_q^2\tau^2 + \dots$. Assuming that the terms $c_f\tau^{2\alpha}$ and
$c_1A_q^2\left(\frac{2q\pi}{N}\right)^2\tau^4$ are dominant respectively in the two terms, we can
obtain an estimation of the position of the first crossover as,

$$n_{1\times} \sim \left[\left(\frac{N}{2q\pi A_q}\right)^{2}\right]^{1/(4-2\alpha)}.$$

In our results presented here, however, the position of $n_{1\times}$ is different from that
presented in the literature.

The third crossover appears in the scale range with large $\tau$, where two parts may be
dominant. One part is the term $c_f\tau^{2\alpha}$ and the other part is the trigonometric terms.
The amplitude of the trigonometric terms is $A_q^2\left(\frac{N}{2q\pi}\right)^2$. Hence,

$$n_{3\times} \sim \left(\frac{A_qN}{2q\pi}\right)^{1/\alpha}.$$



Actually, a DFA-n procedure can eliminate the trends up to order $n-1$ in the original record,
and the leftovers can be a sum of terms with orders $n, n+1, \dots$. Assuming that the term with
order $n$ is dominant in the sum, results for high-order DFA-n can also be obtained, which are
also similar to those presented in references [18-19].

Comparing Fig.(1a) and Fig.(2a) with Fig.(1b) and Fig.(2b) respectively, we can also find that
the effect of a trend with low frequency is much larger than that of a trend with high
frequency. The results above are largely consistent with the discussions in references
[18-19]. But from the figures we can find that there are some basic differences compared with
the results in these two references. In the range $\tau < n_{1\times}$, the curve for the term
obeying the power law and the curve for the total statistical quantity do not coincide with
each other, though they can have a similar scaling behavior, whereas the two curves presented
in reference [18] coincide with each other completely. In the range
$n_{2\times} < \tau < n_{3\times}$, our results possess essentially an oscillating behavior. However,
the curves in reference [18] exhibit a flat region for $n_{2\times} < \tau < n_{3\times}$.

IV. Conclusions 

In summary, based upon random walk theory we present a general formula for the leftovers'
contributions in the DFA method. A time series of correlated noise with a sinusoidal trend is
investigated in detail. This formula can explain the calculated results in the literature very
well. It may also help us to understand the DFA results theoretically and to estimate the
leftovers in the DFA analysis procedure. What is more, this formula may provide useful
information for finding effective tools for time series analysis.

Fig.(1a) Contributions of the correlated noise with $\alpha = 0.75$ and the trend leftover with
sinusoidal form. Here the parameters $(c_0, c_f, c_1)$ are set to be $(1,1,1)$. The amplitude of
the leftover is 2. The frequency of the leftover is 0.01.

Fig.(1b) Contributions of the correlated noise with $\alpha = 0.75$ and the trend leftover with
sinusoidal form. Here the parameters $(c_0, c_f, c_1)$ are set to be $(1,1,1)$. The amplitude of
the leftover is 2. The frequency of the leftover is 0.001.

Fig.(2a) Contributions of the correlated noise with $\alpha = 0.90$ and the trend leftover with
sinusoidal form. Here the parameters $(c_0, c_f, c_1)$ are set to be $(1,1,1)$. The amplitude of
the leftover is 2. The frequency of the leftover is 0.01.

Fig.(2b) Contributions of the correlated noise with $\alpha = 0.90$ and the trend leftover with
sinusoidal form. Here the parameters $(c_0, c_f, c_1)$ are set to be $(1,1,1)$. The amplitude of
the leftover is 2. The frequency of the leftover is 0.001.